Updated Unicode code points for equilibrium arrows #450

Merged
NSoiffer merged 2 commits into daisy:main from brichwin:update_mhchem_arrow_codepoints
Jan 8, 2026

@brichwin brichwin commented Jan 7, 2026

It looks like the Unicode chars for the equilibrium arrows are incorrect in the yaml files.

In Rules/Intent/general.yaml (lines 364-386), I've annotated the existing characters below with the code points I found in the file:

   # this captures the output for the mhchem's "<=>", "<<=>", and "<=>>" output (there are no Unicode arrows for them)
   # this isn't a perfect match, but should be good enough and allows merging all three (see github.com/NSoiffer/MathCAT/issues/60)
   name: chemistry-mhchem-equilibrium-arrow
   tag: mover
   match:
   -    "*[1][substring(., 1, 1)='↽'] and"
   -    "*[2][substring(., string-length(), 1)='⇀']"
   replace:
   - intent:
      name: "chemical-arrow-operator"
      children:
      - test:
          if: "*[1][self::m:mrow]"
          then_test:
              if: "*[2][self::m:mrow]"
              then: [t: "🣒"]    # new in Unicode 17.0 (<=>)          -  FOUND: U+01F8D2
              else: [t: "🣔"]    # new in Unicode 17.0  (<<=>)       -  FOUND: U+01F8D4
          else: [t: "🣓"]        # new in Unicode 17.0  (<==>>).   -  FOUND:  U+01F8D3

Based on fileformat.info/info/unicode/version/17.0/index.htm and the mhchem markup shown in parentheses in the comments above, the existing Unicode characters appear to be incorrect. Here are the definitions I found:

U+1F8D1     LONG RIGHTWARDS HARPOON OVER LONG LEFTWARDS HARPOON
U+1F8D2     LONG RIGHTWARDS HARPOON ABOVE SHORT LEFTWARDS HARPOON
U+1F8D3     SHORT RIGHTWARDS HARPOON ABOVE LONG LEFTWARDS HARPOON

Thus, should the chars in the yaml files be:

              then: [t: "🣑"]    # new in Unicode 17.0 (<=>)          -  U+01F8D1
              else: [t: "🣓"]    # new in Unicode 17.0  (<<=>)       -  U+01F8D3
          else: [t: "🣒"]        # new in Unicode 17.0  (<=>>)    -  U+01F8D2
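
Because these glyphs are hard to tell apart (or do not render at all) in many editors, the proposed mapping can also be written with explicit code points. This is a minimal sketch in Python, not part of MathCAT; the names `MHCHEM_ARROWS` and `arrow_char` are hypothetical, and the mapping is taken only from the definitions listed above.

```python
# Hypothetical helper, not part of MathCAT: mhchem arrow markup mapped to the
# Unicode 17.0 code points proposed in this pull request.
MHCHEM_ARROWS = {
    "<=>": 0x1F8D1,   # LONG RIGHTWARDS HARPOON OVER LONG LEFTWARDS HARPOON
    "<=>>": 0x1F8D2,  # LONG RIGHTWARDS HARPOON ABOVE SHORT LEFTWARDS HARPOON
    "<<=>": 0x1F8D3,  # SHORT RIGHTWARDS HARPOON ABOVE LONG LEFTWARDS HARPOON
}


def arrow_char(markup: str) -> str:
    """Return the single character for an mhchem equilibrium arrow."""
    return chr(MHCHEM_ARROWS[markup])


if __name__ == "__main__":
    for markup, code_point in MHCHEM_ARROWS.items():
        # ascii() keeps the output readable even when the glyph cannot be displayed.
        print(f"{markup:5} U+{code_point:04X} {ascii(arrow_char(markup))}")
```

Building the characters with `chr()` from the code point means a copy/paste slip shows up as a wrong number rather than as an indistinguishable box.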

And in Rules/Languages/*/SharedRules/general.yaml, they appear incorrect as well. Here is what I found:

- name: chemical-arrow-operator
  tag: chemical-arrow-operator
  match: "."
  replace:
  # FIX: this might be better/more efficient if in unicode.yaml
  - bookmark: "@id"
  - test:
    - if: ".='→' or .='⟶'"
      then_test:
        if: "$Verbosity='Terse'"
        then: [t: "forms"]      # phrase(hydrogen and oxygen 'forms' water )
        else: [t: "reacts to form"]      # phrase(hydrogen and oxygen 'reacts to form' water)
    - else_if: ".='⇌' or .='🣒'"      # FOUND: U+01F8D2
      then: [t: "is in equilibrium with"]      # phrase(a reactant 'is in equilibrium with' a product)
    - else_if: ".='🣔'"      # FOUND: U+01F8D4
      then: [t: "is in equilibrium biased to the left with"]      # phrase(the reactant 'is in equilibrium biased to the left with' the product)
    - else_if: ".='🣓'"      # FOUND: U+01F8D3
      then: [t: "is in equilibrium biased to the right with"]      # phrase(the reactant 'is in equilibrium biased to the right with' the product)
      else: [x: "*"]

Should they be:

- name: chemical-arrow-operator
  tag: chemical-arrow-operator
  match: "."
  replace:
  # FIX: this might be better/more efficient if in unicode.yaml
  - bookmark: "@id"
  - test:
    - if: ".='→' or .='⟶'"
      then_test:
        if: "$Verbosity='Terse'"
        then: [t: "forms"]      # phrase(hydrogen and oxygen 'forms' water )
        else: [t: "reacts to form"]      # phrase(hydrogen and oxygen 'reacts to form' water)
    - else_if: ".='⇌' or .='🣑'"      # U+01F8D1
      then: [t: "is in equilibrium with"]      # phrase(a reactant 'is in equilibrium with' a product)
    - else_if: ".='🣓'"      # U+01F8D3
      then: [t: "is in equilibrium biased to the left with"]      # phrase(the reactant 'is in equilibrium biased to the left with' the product)
    - else_if: ".='🣒'"      # U+01F8D2
      then: [t: "is in equilibrium biased to the right with"]      # phrase(the reactant 'is in equilibrium biased to the right with' the product)
      else: [x: "*"]

After making the above changes, the output is at least what I expect it to be.

@NSoiffer NSoiffer left a comment

Thanks for the fixes. In VSCode, the chars just display as empty boxes, so it is hard to see when I have a copy/paste error in them.
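
One way to catch this kind of error when the glyphs render as empty boxes is to print the code points instead of looking at the characters. The following is a rough sketch, not an existing MathCAT tool; the function name `describe_non_bmp` is hypothetical. It assumes the interesting characters all lie above U+FFFF, and it falls back to a placeholder name because a given Python build may ship a Unicode database older than 17.0.

```python
import unicodedata


def describe_non_bmp(text: str) -> list[str]:
    """Return 'U+XXXXX NAME' for every character above U+FFFF in text."""
    found = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            # The default avoids a ValueError when this Python's Unicode
            # database does not know the character yet.
            name = unicodedata.name(ch, "<unnamed in this Unicode database>")
            found.append(f"U+{ord(ch):04X} {name}")
    return found


if __name__ == "__main__":
    # Sample rule line, written with an escape so the source stays readable.
    line = "- else_if: \".='\u21CC' or .='\U0001F8D2'\""
    for entry in describe_non_bmp(line):
        print(entry)
```

Running it over each rule line makes a mismatch between the character and its comment visible as a differing hexadecimal value.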

@NSoiffer NSoiffer merged commit bfc7ce3 into daisy:main Jan 8, 2026
3 of 4 checks passed